Feature Search Patterns Guide

This guide explains the pattern matching system used for feature searching in the Excel import process. Feature search patterns allow you to identify and extract specific data based on cell formatting, content, and location.

Pattern Syntax

Feature patterns use a special syntax that combines multiple criteria using asterisks (*) as separators:

FEATURE_TYPE*FORMAT_CODE*POSITION*COLUMN

Components

  1. FEATURE_TYPE: The type of feature to search for (e.g., FONT_COLOR, BORDER_RIGHT)
  2. FORMAT_CODE: The specific format to match (e.g., FF000000 for black color)
  3. POSITION: The relative position to search (e.g., FIRST, LAST, or numeric index); omitted in shorter patterns such as BORDER_RIGHT*THIN
  4. COLUMN: The column index to search in (optional)
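
The four components above can be split mechanically. The following Python sketch is only an illustration, not the importer's own parser: `FeaturePattern` and `parse_pattern` are hypothetical names, and it assumes that POSITION as well as COLUMN may be left off the end of a pattern, which the shorter examples in this guide (such as BORDER_RIGHT*THIN) suggest.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FeaturePattern:
    """Hypothetical container for the components of one pattern."""
    feature_type: str
    format_code: str
    position: Optional[str] = None  # "FIRST", "LAST", or a numeric string
    column: Optional[int] = None


def parse_pattern(pattern: str) -> FeaturePattern:
    """Split a pattern on '*' into two to four components."""
    parts = pattern.split("*")
    if not 2 <= len(parts) <= 4:
        raise ValueError(f"expected 2 to 4 components, got {len(parts)}: {pattern!r}")
    position = parts[2] if len(parts) > 2 else None
    column = int(parts[3]) if len(parts) > 3 else None
    return FeaturePattern(parts[0], parts[1], position, column)


print(parse_pattern("FONT_COLOR*FF000000*FIRST*15"))
print(parse_pattern("BORDER_RIGHT*THIN"))
```

Keeping missing trailing components as `None` rather than guessing a default leaves the "search anywhere" case explicit.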

Feature Types

Font Features

  • FONT_COLOR: Matches text color
  • FONT_BOLD: Matches bold text
  • FONT_ITALIC: Matches italic text
  • FONT_SIZE: Matches specific font sizes

Border Features

  • BORDER_RIGHT: Matches right border properties
  • BORDER_LEFT: Matches left border properties
  • BORDER_TOP: Matches top border properties
  • BORDER_BOTTOM: Matches bottom border properties

Cell Features

  • BACKGROUND_COLOR: Matches cell background color
  • CELL_TYPE: Matches cell data type
  • CELL_FORMAT: Matches cell format codes
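
A misspelled FEATURE_TYPE is an easy way to end up with a pattern that never matches. As a rough sketch, a pre-check can compare the first component of each pattern against the types listed above. The set below contains only the names documented in this guide, so the importer may well accept others; `unknown_feature_types` is a hypothetical helper, not part of the import tool.

```python
# Feature types named in this guide (assumption: the real list may be longer).
KNOWN_FEATURE_TYPES = frozenset({
    "FONT_COLOR", "FONT_BOLD", "FONT_ITALIC", "FONT_SIZE",
    "BORDER_RIGHT", "BORDER_LEFT", "BORDER_TOP", "BORDER_BOTTOM",
    "BACKGROUND_COLOR", "CELL_TYPE", "CELL_FORMAT",
})


def unknown_feature_types(patterns):
    """Return the sorted feature types that are not in the documented set."""
    found = {pattern.split("*", 1)[0] for pattern in patterns}
    return sorted(found - KNOWN_FEATURE_TYPES)


print(unknown_feature_types(["FONT_COLOR*FF000000", "FONT_COLOUR*FF000000"]))
```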

Pattern Examples

Basic Patterns

{
  "FONT_COLOR*FF000000*FIRST*15": "Black text in first row, column 15",
  "BORDER_RIGHT*THIN": "Thin right border anywhere",
  "BACKGROUND_COLOR*FFFFFFFF*LAST": "White background in last row"
}

Complex Patterns

Multiple patterns can be combined using search types:

{
  "TYPE": "OR_ARRAY",
  "FEATURES": [
    "FONT_COLOR*FF000000*FIRST*15",
    "FONT_COLOR*FF000000*FIRST*17",
    "FONT_COLOR*FF000000*FIRST*19"
  ]
}

Search Types

CONDITION is used for simple feature matching:

{
  "TYPE": "CONDITION",
  "FEATURE": "TopBorder.Thin"
}

OR matches if any pattern matches:

{
  "TYPE": "OR",
  "FEATURES": [
    "FONT_COLOR*FF000000*FIRST*14",
    "FONT_COLOR*FF000000*2*14"
  ]
}

AND requires all patterns to match:

{
  "TYPE": "AND",
  "FEATURES": [
    "BORDER_RIGHT*THIN",
    "FONT_BOLD*TRUE"
  ]
}
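
The CONDITION, OR, and AND semantics described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the importer's implementation: `evaluate` is a hypothetical function, and `matches` is a stand-in callback for whatever test the importer applies to a single pattern. OR_ARRAY is deliberately not handled because this guide does not define how it differs from OR.

```python
from typing import Callable, Mapping


def evaluate(search: Mapping, matches: Callable[[str], bool]) -> bool:
    """Combine per-pattern results according to the search TYPE."""
    kind = search["TYPE"]
    if kind == "CONDITION":
        return matches(search["FEATURE"])
    features = search["FEATURES"]
    if isinstance(features, str):  # a single pattern given as a plain string
        features = [features]
    if kind == "OR":
        return any(matches(f) for f in features)
    if kind == "AND":
        return all(matches(f) for f in features)
    raise ValueError(f"unsupported search type: {kind!r}")


# Pretend the cell under test has a thin right border but is not bold.
cell_features = {"BORDER_RIGHT*THIN"}
both = ["BORDER_RIGHT*THIN", "FONT_BOLD*TRUE"]
print(evaluate({"TYPE": "AND", "FEATURES": both}, cell_features.__contains__))
print(evaluate({"TYPE": "OR", "FEATURES": both}, cell_features.__contains__))
```

Because `any` and `all` short-circuit, an OR stops at the first matching pattern and an AND stops at the first failing one.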

FeatureLabelSearch is a special configuration for searching labeled features:

"FeatureLabelSearch": {
  "FRANJAS": {
    "SOURCE_FEATURE": "PAGE_SPLIT_TYPE",
    "TYPE": "OR",
    "ROW_THEN_COLUMN": "true",
    "FEATURES": "FONT_COLOR*FFFF0000*FIRST",
    "MAX_COLUMN": 10
  }
}

Format Codes

Color Codes

  • FF000000: Black
  • FFFF0000: Red
  • FF0000FF: Blue
  • FF00FF00: Green
  • FFFFFF00: Yellow
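
The codes listed above read as 8-digit ARGB hex values: two digits each for alpha, red, green, and blue. Assuming that layout, a short Python sketch can decode a code and reject one of the wrong length, which helps when a pattern silently fails to match. `parse_argb` is a hypothetical helper, not part of the importer.

```python
import string
from typing import Tuple


def parse_argb(code: str) -> Tuple[int, int, int, int]:
    """Decode an AARRGGBB hex string into (alpha, red, green, blue)."""
    if len(code) != 8 or any(c not in string.hexdigits for c in code):
        raise ValueError(f"expected 8 hex digits (AARRGGBB), got {code!r}")
    value = int(code, 16)
    return (value >> 24) & 0xFF, (value >> 16) & 0xFF, (value >> 8) & 0xFF, value & 0xFF


print(parse_argb("FFFF0000"))  # red from the list above
print(parse_argb("FF0000FF"))  # blue from the list above
```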

Border Styles

  • THIN: Thin border
  • MEDIUM: Medium border
  • THICK: Thick border
  • DOUBLE: Double line border

Search Locations

Position Specifiers

  • FIRST: First occurrence
  • LAST: Last occurrence
  • Numeric values (1, 2, 3, etc.): Specific occurrence
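
One way to read the position specifiers is as a selector over an ordered list of matches. The Python sketch below assumes numeric positions are 1-based, which this guide does not state explicitly, and `select_occurrence` is a hypothetical name used only for illustration.

```python
from typing import Optional, Sequence, TypeVar

T = TypeVar("T")


def select_occurrence(matches: Sequence[T], position: str) -> Optional[T]:
    """Pick one match by FIRST, LAST, or a numeric position (assumed 1-based)."""
    if not matches:
        return None
    if position == "FIRST":
        return matches[0]
    if position == "LAST":
        return matches[-1]
    index = int(position)  # raises ValueError for anything else
    if 1 <= index <= len(matches):
        return matches[index - 1]
    return None


matching_rows = [3, 7, 12]  # made-up row numbers where a pattern matched
print(select_occurrence(matching_rows, "FIRST"))
print(select_occurrence(matching_rows, "2"))
print(select_occurrence(matching_rows, "LAST"))
```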

Direction Control

  • ROW_THEN_COLUMN: Search row-wise first
  • COLUMN_THEN_ROW: Search column-wise first
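
The two directions differ only in which loop is the outer one. The Python sketch below shows one plausible reading, in which ROW_THEN_COLUMN walks across each row before moving to the next row; the importer's actual traversal may differ, and `scan_order` is a hypothetical name.

```python
from typing import Iterator, Tuple


def scan_order(n_rows: int, n_cols: int, row_then_column: bool = True) -> Iterator[Tuple[int, int]]:
    """Yield (row, column) pairs in the assumed search order."""
    if row_then_column:
        for row in range(n_rows):
            for col in range(n_cols):
                yield row, col
    else:
        for col in range(n_cols):
            for row in range(n_rows):
                yield row, col


print(list(scan_order(2, 2, row_then_column=True)))
print(list(scan_order(2, 2, row_then_column=False)))
```

The order matters whenever a position specifier such as FIRST is involved, because the first match found depends on which cells are visited first.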

Best Practices

  1. Pattern Organization

    • Group related patterns together
    • Use meaningful names for pattern groups
    • Document complex pattern combinations
  2. Performance Optimization

    • Use specific column references when possible
    • Limit search ranges where appropriate
    • Combine related patterns using OR/AND operations
  3. Maintainability

    • Use ReusableFormats for common patterns
    • Document color codes and special formats
    • Keep pattern structure consistent

Common Pattern Use Cases

Header Detection

{
  "TYPE": "AND",
  "FEATURES": [
    "FONT_BOLD*TRUE*FIRST",
    "BACKGROUND_COLOR*FF000000*FIRST"
  ]
}

Data Region Borders

{
  "TYPE": "OR",
  "FEATURES": [
    "BORDER_RIGHT*THIN",
    "BORDER_LEFT*THIN"
  ]
}

Special Cell Formatting

{
  "TYPE": "AND",
  "FEATURES": [
    "FONT_COLOR*FF0000FF",
    "FONT_BOLD*TRUE"
  ]
}

Troubleshooting

Common Issues

  1. Pattern Not Matching

    • Verify color codes are correct
    • Check position specifiers
    • Confirm column numbers
  2. Multiple Matches

    • Use more specific patterns
    • Add position constraints
    • Consider using AND conditions
  3. Performance Issues

    • Limit search ranges
    • Use specific column references
    • Optimize pattern combinations

Pattern Testing

Test patterns incrementally:

  1. Start with basic patterns
  2. Add complexity gradually
  3. Verify each component separately
  4. Combine patterns carefully
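
Verifying each component separately can be as simple as listing which patterns in a combination fail on their own. The Python sketch below assumes a `matches` callback that stands in for the importer's single-pattern test; both it and `failing_features` are hypothetical names.

```python
from typing import Callable, Iterable, List


def failing_features(features: Iterable[str], matches: Callable[[str], bool]) -> List[str]:
    """Return the patterns that do not match when tested one at a time."""
    return [feature for feature in features if not matches(feature)]


# Pretend the cell under test has a thin right border but is not bold.
cell_features = {"BORDER_RIGHT*THIN"}
combination = ["BORDER_RIGHT*THIN", "FONT_BOLD*TRUE"]
print(failing_features(combination, cell_features.__contains__))
```

An empty result means every component matches individually, so a failing combination points at the search type or position constraints instead.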